Extended Translation Models in Phrase-based Decoding

Authors

  • Andreas Guta
  • Joern Wuebker
  • Miguel Graça
  • Yunsu Kim
  • Hermann Ney
Abstract

We propose a novel extended translation model (ETM) to counteract two problems in phrase-based translation: the lack of translation context when using single-word phrases, and uncaptured dependencies beyond phrase boundaries. The ETM operates at the word level and augments the IBM models with an additional bilingual word pair and a reordering operation. Its implementation in a phrase-based decoder introduces translation and reordering dependencies for single-word phrases as well as dependencies across phrase boundaries. Moreover, the model incorporates an explicit treatment of multiple and empty alignments. Its integration significantly outperforms competitive systems that include lexical and phrase translation models as well as hierarchical reordering models on four language pairs, by +0.7% BLEU on average. Although simpler and using fewer dependencies, the ETM proves to be on par with 7-gram operation sequence models (Durrani et al., 2013b).
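
To make the idea of conditioning on "an additional bilingual word pair and a reordering operation" more concrete, the following is a minimal sketch and not the authors' formulation. It assumes a relative-frequency estimate of a target word given its aligned source word and the previous bilingual word pair, and it collapses reordering into three hypothetical jump classes. The toy corpus, the function names, and the jump classes are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical toy data (not from the paper): each sentence is a list of
# (source_word, target_word, source_position) triples in target order.
corpus = [
    [("das", "the", 0), ("haus", "house", 1)],
    [("das", "the", 0), ("auto", "car", 1)],
    [("das", "that", 0), ("ist", "is", 1)],
    [("ist", "is", 1), ("das", "that", 0)],
]

def jump_class(prev_pos, cur_pos):
    """Assumed coarse reordering operation: monotone, forward, or backward."""
    delta = cur_pos - prev_pos
    if delta == 1:
        return "monotone"
    return "forward" if delta > 1 else "backward"

def train(corpus):
    """Collect counts for the lexical model and the jump classes."""
    lex_counts, ctx_counts, jump_counts = Counter(), Counter(), Counter()
    for sent in corpus:
        prev_src, prev_tgt, prev_pos = "<s>", "<s>", -1
        for src, tgt, pos in sent:
            # Context = current source word plus the previous bilingual pair.
            ctx = (src, prev_src, prev_tgt)
            lex_counts[(tgt, ctx)] += 1
            ctx_counts[ctx] += 1
            jump_counts[jump_class(prev_pos, pos)] += 1
            prev_src, prev_tgt, prev_pos = src, tgt, pos
    return lex_counts, ctx_counts, jump_counts

def lex_prob(tgt, src, prev_src, prev_tgt, lex_counts, ctx_counts):
    """Relative-frequency estimate of p(tgt | src, prev_src, prev_tgt)."""
    ctx = (src, prev_src, prev_tgt)
    if ctx_counts[ctx] == 0:
        return 0.0  # unseen context; a real model would smooth or back off
    return lex_counts[(tgt, ctx)] / ctx_counts[ctx]

lex_counts, ctx_counts, jump_counts = train(corpus)
print(lex_prob("the", "das", "<s>", "<s>", lex_counts, ctx_counts))
print(lex_prob("that", "das", "ist", "is", lex_counts, ctx_counts))
print(dict(jump_counts))
```

In this toy setting, the word "das" is ambiguous at the start of a sentence but becomes unambiguous once the preceding pair ("ist", "is") is known, which is the kind of context a single-word phrase lacks on its own.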

Similar resources

NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation

We present a new open source toolkit for phrase-based and syntax-based machine translation. The toolkit supports several state-of-the-art models developed in statistical machine translation, including the phrase-based model, the hierarchical phrase-based model, and various syntax-based models. The key innovation provided by the toolkit is that the decoder can work with various grammars and offers...

Efficient Incremental Decoding for Tree-to-String Translation

Syntax-based translation models should in principle be efficient with polynomially-sized search space, but in practice they are often embarrassingly slow, partly due to the cost of language model integration. In this paper we borrow from phrase-based decoding the idea to generate a translation incrementally left-to-right, and show that for tree-to-string models, with a clever encoding of derivat...

Investigations on Phrase-based Decoding with Recurrent Neural Network Language and Translation Models

This work explores the application of recurrent neural network (RNN) language and translation models during phrase-based decoding. Due to their use of unbounded context, the decoder integration of RNNs is more challenging compared to the integration of feedforward neural models. In this paper, we apply approximations and use caching to enable RNN decoder integration, while requiring reasonable m...

Integrating Translation Memory into Phrase-Based Machine Translation during Decoding

Since statistical machine translation (SMT) and translation memory (TM) complement each other in matched and unmatched regions, integrated models are proposed in this paper to incorporate TM information into phrase-based SMT. Unlike previous multi-stage pipeline approaches, which directly merge TM result into the final output, the proposed models refer to the corresponding TM information associ...

Enriching Phrase-Based Statistical Machine Translation with POS Information

This work presents an extension to phrase-based statistical machine translation models which incorporates linguistic knowledge, namely part-of-speech information. Scores are added to the standard phrase table which represent how the phrases correspond to their translations on the part-of-speech level. We suggest two different kinds of scores. They are learned from a POS-tagged version of the para...

Publication date: 2015